refactor(run-mode): remove the in-session dispatch mechanism (#119)#124
Merged
Conversation
…-batch subcommands These fixture-swap/barrier subcommands exist only to serve the in-session (interactive Claude Code) isolated-run loop: they are referenced solely by in-session runbook generation and their own test modules, never by any Cli dispatch path. Removing them is the first step of dropping the in-session dispatch mechanism (#119). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Make CLI dispatch the only mechanism end-to-end and delete all in-session
(`DispatchMechanism::InSession` / `RunMode::Interactive`) code. Claude Code now
defaults to and runs via its `claude -p` stream-json path, exactly like Codex's
`codex exec`; hybrid and headless are the only run modes (kept as accepted
`--run-mode` values pending the vocabulary collapse in #C).
Removed:
- `DispatchMechanism` enum, `RunMode::Interactive`, `RunMode::mechanism()`;
`default_for(ClaudeCode)` now → hybrid, `resolve_run_mode` ClaudeCode →
[hybrid, headless]; `HarnessRunCapabilities.mechanism` field.
- The single-shared-`env/` layout: every run now stages one
`env-<group>-<condition>/` per (group, condition).
- `HarnessAdapter::{runbook_template, parse_transcript, parse_transcript_full}`;
`parse_cli_events*` are now required (Codex parses its events file directly,
OpenCode errors until ingest is wired). The in-session runbook + its profile.
- In-session transcript code (`parse_transcript*`, subagent resolution by
`agent_description`, `resolve_subagents_dir*`, `SubagentMeta`/`SubagentEntry`)
and the `--subagents-dir` / `--session-id` flags + `CLAUDE_CODE_SESSION_ID`
auto-resolution. Shared `extract_invocations`/`read_records`/`TranscriptSummary`
kept (the stream-json parser reuses them).
All mechanism-keyed branches collapse to the CLI path; surviving harness-specific
behavior keys on the harness/adapter. Pipeline record-runs/fill-transcripts now
read each task's `outputs/<harness>-events.jsonl` only. Broken prose referencing
the deleted flags/commands rewritten in the manifest, help, and stray-writes hint;
the run-mode vocabulary/docs rewrite (incl. docs/harness-parity.md) stays with #C.
Integration tests re-pointed from the single `env/` to the per-(group,condition)
layout; in-session-only tests removed. `cargo test`, `cargo fmt --check`, and
`cargo clippy --all-targets -- -D warnings` are green; Claude Code `claude -p`
run → ingest → finalize smoke produces run.json/timing.json/benchmark.json.
Part of #113. Closes #119. Likely supersedes #34; unblocks #120 and #122.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #119. Part of the Instant magic epic (#113).
Makes CLI dispatch the only mechanism end-to-end and deletes all in-session (
DispatchMechanism::InSession/RunMode::Interactive) code. Claude Code now defaults to and runs via itsclaude -pstream-json path, exactly like Codex'scodex exec.hybridandheadlessare the only run modes (kept as accepted--run-modevalues pending the vocabulary collapse in #120).Net −2237 lines across 40 files — mostly deletion.
What changed
Commit 1 — delete the in-session-only subcommands.
switch-conditionandreset-batchwere referenced only by in-session runbook generation and their own tests.Commit 2 — remove the mechanism (atomic).
DispatchMechanismenum,RunMode::Interactive,RunMode::mechanism()removed;default_for(ClaudeCode)→ hybrid;resolve_run_modeClaudeCode →[hybrid, headless];HarnessRunCapabilities.mechanismdropped.env/layout gone — every run stages oneenv-<group>-<condition>/per(group, condition).HarnessAdapter:parse_cli_events*now required;runbook_template/parse_transcript*removed. Codex parses its events file directly; OpenCode errors until ingest is wired. Runbook always uses the shared headless template.agent_description,resolve_subagents_dir*,SubagentMeta/SubagentEntry), the--subagents-dir/--session-idflags, andCLAUDE_CODE_SESSION_IDauto-resolution. Sharedextract_invocations/read_records/TranscriptSummarykept (the stream-json parser reuses them).record-runs/fill-transcriptsread each task'soutputs/<harness>-events.jsonlonly.Per the agreed scope: dead flags removed outright; only broken prose refs fixed (manifest,
--help, stray-writes hint). The run-mode vocabulary/docs rewrite — includingdocs/harness-parity.md, which still describes the removed dispatch-mechanism API — is left for #120 / #121.Before / after
A bare
eval-magic run --harness claude-code(the default):iteration-N/env/iteration-N/env-<group>-<condition>/per pairenv/, agent-followediteration-N/RUNBOOK.md, human-followed CLI recipeclaude -pstream-json per taskCLAUDE_CODE_SESSION_IDoutputs/claude-events.jsonlVerification
cargo test— 468 pass, 0 fail (346 lib + 58 cli + 64 run + doc)cargo fmt --check— cleancargo clippy --all-targets -- -D warnings— cleangrepoversrc/— noInSession/DispatchMechanism/RunMode::Interactive/ deleted-flag referencesclaude -p):run→ simulated dispatch (claude-events.jsonl) →ingest(assembledrun.json+timing.jsonwithsource: transcriptfrom the stream-jsonresultevent) →finalize(wrotebenchmark.json)Follow-ups (out of scope here)
docs/harness-parity.mdrewrite owed by docs(harness): define the minimal harness interface + progressive-enhancement tiers #121.agent_descriptionis now written todispatch.jsonbut no longer read (kept to avoid schema churn here); a candidate for later cleanup.🤖 Generated with Claude Code